Recent Advances of Speech Databases Development Activity for Indian Languages

Authors

  • S S Agrawal
  • K Samudravijaya
  • Karunesh Arora
Abstract

The development of speech corpora and acoustic-phonetic databases is indispensable for any research and development work in spoken language systems. Systematic efforts have been made to create speech databases for some major languages of India. This paper presents the status of, and recent advances in, corpus development for some of the Indian languages. The types of databases developed include text corpora for speech, annotated and non-annotated speech corpora, acoustic-phonetic and labeled speech databases, and special speech corpora. These have been developed for general-purpose as well as task-oriented applications. Databases for a few Indian languages have been designed carefully to include adequate representation of textual and linguistic information, regional and dialectal variations, speaking styles, and environments. These databases have been used to develop systems for text-to-speech synthesis, speech recognition, speaker identification, speech secrecy, language translation, and forensic applications.


Similar resources

The IIIT-H Indic Speech Databases

This paper discusses the efforts in collecting speech databases for Indian languages – Bengali, Hindi, Kannada, Malayalam, Marathi, Tamil and Telugu. We discuss relevant design considerations in collecting these databases, and demonstrate their usage in speech synthesis. By releasing these speech databases in the public domain without any restrictions for non-commercial and commercial purposes,...


Experiments with Unit Selection Speech Databases for Indian Languages

This paper presents a brief overview of unit selection speech synthesis and discusses the issues relevant to the development of voices for Indian languages. We discuss a few perceptual experiments conducted on Hindi and Telugu voices. 1 Role of Language Technologies: Most of the information in the digital world is accessible only to the few who can read or understand a particular language. Language technolog...


Stages and Method of Preparing Syllable and Diphone Speech Databases for a Persian Text-to-Speech System

Abstract: Speech databases are part of concatenative text-to-speech synthesis systems. The phonetic quality of the databases plays a significant role in the naturalness of the synthesized speech. This paper introduces two speech databases for Persian, one syllable-based and one diphone-based, and describes how they were developed, their specifications, and their advantages relative to each other. ...


Proceedings of Meetings on Acoustics

India possesses a large variety of languages and dialects spoken in different parts of the country. These languages possess some unique linguistic, phonological and phonetic properties that differ from those of European languages. Research is being done in several Indian languages, such as Hindi and Bangla, to study their articulatory, acoustic, phonetic and prosodic nature for the purpose of creating s...


A Review on Speech Corpus Development for Automatic Speech Recognition in Indian Languages

Cini Kurian, Department of Computer Science, Al-Ameen College, Edathala, Aluva, Kerala. Abstract: Corpus development gained much attention due to recent statistics based natural language processing. It has new applications in Language Technology, lingui...



Publication date: 2006